Communicative patterns in Romanian workplace written texts
Similar resources
Diacritics Restoration in Romanian Texts
There are several languages that use diacritical characters outside the ASCII charset. For some of the languages, most diacritical characters can be deterministically recovered but in general, this is not the prevailing case. However, the difficulty of the task differs from language to language depending on the functional role of the diacritical characters. For Romanian, automatic restoration o...
Automatic Diacritics Insertion in Romanian Texts
The problem of automatic insertion of diacritics into an electronic text is well justified for several languages. Even if the diacritical characters concerned are present in the extended 8-bit ASCII charset (as the case is with French) any 7-bit filtering transmission of such a text will corrupt it. The situation is even worse when the diacritic characters are not in the 8-bit ASCII charset. To...
Automatic Structuring of Written Texts
This paper deals with automatic structuring and sentence boundary labelling in natural language texts. We describe the implemented structure tagging algorithm and heuristic rules that are used for automatic or semiautomatic labelling. Inside the detected sentence the algorithm performs a decomposition to clauses and then marks the parts of text which do not form a sentence, i.e. headings, signa...
Temporal classification for historical Romanian texts
In this paper we look at a task at the border of natural language processing, historical linguistics and the study of language development, namely that of identifying the time when a text was written. We use machine learning classification using lexical, word-ending and dictionary-based features, with linear support vector machines and random forests. We find that lexical features are the most help...
On Classifying Coherent/Incoherent Romanian Short Texts
In this paper we present and discuss the results of a text coherence experiment performed on a small corpus of Romanian text from a number of alternative high school manuals. During the last 10 years, an abundance of alternative manuals for high school was produced and distributed in Romania. Due to the large amount of material and to the relatively short time in which it was produced, the questi...
Journal
Journal title: Revista signos
Year: 2010
ISSN: 0718-0934
DOI: 10.4067/s0718-09342010000500005